Abstract


The  problem  of  simultaneously  estimating  phase  and  decoding  data 
symbols  from  baseband  data  is  posed.  The  phase  sequence  is  assumed  to 
be  a  random  sequence  on  the  circle  and  the  symbols  are  assumed  to  be 
equally-likely  symbols  transmitted  over  a  perfectly  equalized  channel. 

A dynamic programming algorithm (Viterbi algorithm) is derived for
decoding a maximum a posteriori (MAP) phase-symbol sequence on a finite
dimensional phase-symbol trellis. A new and interesting principle of
optimality for simultaneously estimating phase and decoding
phase-amplitude coded symbols leads to an efficient two-step decoding
procedure for decoding phase-symbol sequences. Simulation results for
binary, 8-ary PM, and 16-QASK symbol sets transmitted over random walk and
sinusoidal jitter channels are presented, and compared with results one
may obtain with a decision-directed algorithm, or with the binary
Viterbi algorithm introduced by Ungerboeck. When phase fluctuations
are severe, and the symbol set is rich (as in 16-QASK), MAP
phase-symbol sequence decoding on circles is superior to Ungerboeck's
technique, which in turn is superior to decision-directed techniques.


I.  Introduction 


Phase  fluctuations  can  significantly  increase  the  error  probability 
for  coded  or  uncoded  symbols  transmitted  over  a  channel  that  may  or  may 
not have been equalized. This is especially true for PSK and QASK
symboling, in which case accurate phase discrimination is essential for
symbol decoding*. Even when the receiver contains a decision-directed
phase-locked loop (DDPLL), performance loss in SNR with respect to a coherent
decoding system can be in the range 5-10 dB. This fact is established in
[1]  for  practical  symbol  sets  and  typical  values  of  the  phase  variance 
parameter  and  symbol  error  probability. 

On  telephone  lines  linear  distortion  and  phase  jitter  dictate  the  use 
of  a  channel  equalizer  and  some  kind  of  phase  estimator  to  achieve  high 
rate,  low  error  probability,  data  transmission.  A  common  approach  to 
phase  estimation  and  data  decoding  is  to  use  a  decision-directed  algorithm 
in  which  a  phase  estimate  is  updated  on  the  basis  of  old  phase  estimates 
and  old  symbol  decisions.  The  DDPLL  of  [5]  is  a  first-order  digital 
phase-locked  loop  (PLL)  in  which  the  phase  estimate  is  updated  on  the 
basis of a new measured phase and an old symbol decision. In the jitter
equalizer (JE) of [3] and [4] a complex gain is updated according to a simple
decision-directed stochastic approximation algorithm. The complex gain
is  used  to  scale  and  rotate  the  received  signal,  thereby  correcting  phase 
jitter  and  normalizing  rapid  fading  variations.  Although  there  is  no 
explicit  interest  in  phase  estimation  itself  in  the  JE,  it  is  possible  to 
interpret  the  structure  as  an  adaptive  gain-phase  correcting  equalizer. 

*The modifiers PSK and QASK stand for "Phase Shift Keyed" and "Quadrature
Amplitude Shift Keyed," respectively; SNR will mean signal-to-noise
ratio.

Both the DDPLL and the JE are very simple to implement, but apparently
neither  achieves  optimality  with  respect  to  any  statistical  criterion  for 
symbol  (or  data)  decoding.  Furthermore,  neither  the  DDPLL  nor  the  JE  is 
optimum  for  estimating  and/or  correcting  phase.  Therefore  an  important 
question  to  be  answered  is  whether  or  not  symbol  decoding  can  be  improved 
using  a  better  phase  estimator.  The  answer,  based  on  the  results  of  [1] 
and  this  paper,  is  that  significant  improvements  can  be  realized  when  the 
phase  fluctuations  are  severe  if  one  is  willing  to  pay  the  price  of  an 
increased  computational  burden.  In  practice,  cases  of  severe  phase 
fluctuation  can  occur  in  high  data  rate  PSK  and  QASK  systems  in  which  the 
angular  distance  between  symbols  is  small. 

In [1] Ungerboeck recognized the potential of maximum a posteriori
(MAP) sequence estimation for jointly estimating phase and decoding data
symbols. A path metric was derived and its role in a forward dynamic
programming algorithm for obtaining MAP phase-symbol sequences was
indicated. Because of the way phase was modelled in [1], the dynamic
programming algorithm could not be solved directly. Using two approximations,
Ungerboeck derived an implementable algorithm and obtained performance
results that were on the order of 3 dB superior in SNR to the DDPLL in a
16-QASK system, at interesting values of the phase variance parameter.
We call the algorithm of [1] a discrete binary Viterbi algorithm (DBVA).

The  reader  is  referred  also  to  [5]  and  [6]  for  discussions  of  other 
sub-optimum,  but  computationally  tractable,  algorithms  for  simultaneously 
estimating  phase  and  decoding  data  symbols. 

In this paper we observe that baseband data is invariant to modulo-$2\pi$
transformations on the phase sequence. This motivates us to wrap the
phase around the circle, so to speak, and obtain folded probability models
for transition probabilities on the circle. When the phase process is a
normal random walk on the circle, the transition probabilities are
described by a folded normal model. This model has also been used in
[7] and [8]. It is then straightforward to pose a MAP sequence estimation
problem for simultaneous phase and symbol sequence decoding as
described in [8] and [9]. The basic idea is to discretize the phase
space $[-\pi, \pi)$ to a finite dimensional grid and to use a dynamic
programming algorithm (Viterbi algorithm) to keep track of surviving phase-symbol
sequences that can ultimately approximate the desired MAP phase-symbol
sequence. The MAP phase-symbol sequence, itself, is the entire sequence
of past phases and symbols that is most likely, given an entire sequence
of recorded observations. Details of the algorithm are given in [8] and
[9]. For PSK and QASK symbol sets an interesting principle of optimality
leads  to  an  efficient  two-step  decoding  procedure.  With  this  procedure 
computational  complexity  is  reduced  by  a  factor  near  to  the  square  of  the 
number  of  admissible  phase  values  per  amplitude  level.  This  amounts  to 
a  factor  of  16  for  the  16-point  QASK  diagram  that  has  been  recommended  by 
CCITT  for  data  transmission  on  telephone  lines  at  9600  b/s.  Finally,  in 
order  to  make  the  computation  and  storage  requirements  tractable  in  the 
Viterbi  algorithm,  we  use  it  in  a  fixed  delay  mode,  as  do  other  authors. 

By  appealing  to  known  results  for  fixed-lag  smoothing  of  linearly-observed 
data, we are able to intelligently choose the fixed delay. Without
significant performance loss we decode phase-symbol pairs at a depth constant
of  10.  This  obviates  the  need  for  huge  storage  requirements  for  long 
sequences.  With  these  modifications  the  Viterbi  algorithm  becomes  a 
feasible,  albeit  sophisticated,  decoding  procedure. 

Simulation results for the proposed Viterbi algorithm (VA) are
presented for several symbol sets consisting of 2, 8 or 16 symbols. Several
types of phase jitter are investigated, such as Gaussian and non-Gaussian
random walk, and sinusoidal phase jitter. The resulting error probabilities
are compared with those of the simpler decision-directed algorithms (JE and
DDPLL), and those of the DBVA. As expected, performance of the VA is
always superior to that of the other systems. On the other hand the increase
in computational burden is substantial and the improvement in performance
is not always great enough to warrant the use of the VA. In our
concluding remarks we discuss situations in which one might reasonably use the
VA rather than a simpler decision-directed algorithm such as the JE or the DDPLL.

Remarks on Notation:

Throughout this paper $\perp\!\!\!\perp$ denotes statistical independence. The
notation $\{\phi_k\}_1^K$ will mean the set $\{\phi_k,\ k = 1, 2, \ldots, K\}$.
When the indexes 1 and $K$ are missing (e.g., $\{\phi_k\}$), it is understood
that $K$ is infinite. The symbol $N^+$ denotes the positive integers. The
notation $x : N_x(\mu, \sigma^2)$ means the random variable $x$ is normally
distributed with mean $\mu$ and variance $\sigma^2$; $N_x(\mu, \sigma^2)$ will
also be used to denote the function
$(2\pi\sigma^2)^{-1/2} \exp\{-(x-\mu)^2 / 2\sigma^2\}$. When $x$ is complex,
$x : N_x(\mu, \sigma^2)$ means $x$ is complex with density
$N_x(\mu, \sigma^2) = (2\pi\sigma^2)^{-1} \exp\{-|x-\mu|^2 / 2\sigma^2\}$.
By $f(x/y)$ we mean the conditional probability density of the random
variable $x$, given the random variable $y$. Thus $f(x/y)$ is generally a
different function than $f(w/z)$, even though we use no explicit
subscripting such as $f_{x/y}(\cdot/\cdot)$ to indicate so. We make
no notational distinction between a random variable and its realizations,
relying instead on context to make the meaning clear. A density function
for a random variable, evaluated at a particular realization of the
random variable, is termed a likelihood function. "Hatted" variables such as
$\hat{\phi}_k$ refer always to MAP estimates that maximize an a posteriori
density.

Finally it is convenient to define the function

$$ g_M(x) = M^{-1} \sum_{m=1}^{M} \sum_{\ell=-\infty}^{\infty} h[x - \ell 2\pi - (m-1)2\pi/M] \qquad (1) $$

where $h(\cdot)$ is a probability density. The function $g_M(\cdot)$ plays an
important role in our discussion of phase-symbol decoding on QASK symbol
sets.
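
The definition in (1) can be explored numerically. The following Python sketch is illustrative only: it truncates the infinite sum over $\ell$ to a finite number of wraps, and the names `normal_pdf`, `folded_density`, `integrate_over_circle` and the parameter `n_wraps` are assumptions introduced here, not part of the original development.

```python
import math


def normal_pdf(x, sigma):
    """Zero-mean normal density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / math.sqrt(2.0 * math.pi * sigma * sigma)


def folded_density(x, h, M=1, n_wraps=20):
    """Approximate g_M(x) of (1), truncating the sum over l to |l| <= n_wraps."""
    total = 0.0
    for m in range(1, M + 1):
        for l in range(-n_wraps, n_wraps + 1):
            total += h(x - l * 2.0 * math.pi - (m - 1) * 2.0 * math.pi / M)
    return total / M


def integrate_over_circle(g, n=2000):
    """Midpoint-rule integral of g over [-pi, pi)."""
    step = 2.0 * math.pi / n
    return sum(g(-math.pi + (i + 0.5) * step) for i in range(n)) * step
```

With a narrow normal $h$, the folded function should integrate to one over the circle for any $M$, and $g_M$ should repeat with period $2\pi/M$; both properties are easy to confirm with these helpers.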
II. Signal and Phase Models

Assume complex data symbols $\{a_k\}$ are phase or phase-amplitude modulated
onto a carrier and transmitted over a channel with linear distortion and
phase jitter. The received signal, call it $y(t)$, is typically processed
as illustrated in Figure 1. The signal $y(t)$ is passed through a bandpass
noise filter and demodulated with two quadrature waveforms. The resulting
complex baseband signal $x_1(t) + jx_2(t)$ is equalized with a complex
adaptive equalizer in order to reduce the intersymbol interference due to
linear distortion in the channel. The equalized signal is a sequence of
samples at symbol rate $1/\Delta$ ($\Delta$ is the interval between successive
data symbols). The output of the equalizer is a complex sequence
$x_k = x_k^{(1)} + jx_k^{(2)}$ which is a noisy, phase-distorted version of
the original transmitted sequence. Thus we write

$$ x_k = a_k e^{j\phi_k} + n_k, \quad k \in N^+ . \qquad (2) $$

Here $\{a_k\}$ is the complex symbol sequence, typically encoded according
to one of the diagrams illustrated in Figure 2. The sequence $\{\phi_k\}$
represents phase fluctuations (jitter and frequency drift) in the channel.
The two real components $n_k^{(1)}$ and $n_k^{(2)}$ of the complex noise
sequence $n_k = n_k^{(1)} + jn_k^{(2)}$ are the noise variables in the
respective baseband quadrature equalized channels. The variables $n_k^{(1)}$
and $n_k^{(2)}$ can be shown to be independent when the carrier frequency is
in the middle of the input noise filter bandwidth and the additive channel
noise is white.

If the equalizer is perfect, then $n_k$ is the usual Gaussian, additive
noise with zero mean. If the equalizer is not perfect, then $n_k$ contains
a residual of the intersymbol interference, and is not Gaussian; nor
are successive variables $n_k, n_{k+1}, \ldots$ independent. However, for a
reasonably good equalizer, we may assume that $\{n_k\}$ is a sequence of
independent identically distributed (i.i.d.) complex Gaussian variables.
Strictly speaking this assumption is valid only at the input to the
equalizer when the baseband equivalent of the input noise filter and
low-pass demodulator is the so-called sampled, whitened-matched filter of [10].
In practice the assumption of Gaussianity is more realistic than the
assumption of independence for the sequence $\{n_k\}$. Assuming that the
equalizer of Figure 1 is perfect we model the noise sequence $\{n_k\}$ as
follows:


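$$ n_k = n_k^{(1)} + jn_k^{(2)}, \quad k \in N^+ $$

$$ n_k^{(1)} \perp\!\!\!\perp n_\ell^{(2)} \quad \forall (k, \ell) \qquad (3) $$

$$ n_k^{(1)} \perp\!\!\!\perp n_\ell^{(1)}, \ k \ne \ell; \qquad n_k^{(2)} \perp\!\!\!\perp n_\ell^{(2)}, \ k \ne \ell $$

$$ n_k^{(1)} : N(0, \sigma_n^2); \qquad n_k^{(2)} : N(0, \sigma_n^2) $$

Here $2\sigma_n^2$ is the variance of the complex noise variable $n_k$ and
$\sigma_n^2$ is the variance of each real component.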

Consider now the phase distortion $\{\phi_k\}$. The term $\phi_k$ generally
reflects two effects, one long-term and the other short-term. In modern
high speed data modems no carrier or pilot tone is transmitted for
locking the local oscillator at the receiver. Thus long-term,
large-range linear phase variations result from frequency drift in the
channel which cannot be eliminated. In addition, nonlinear
intermodulation with local power supplies gives rise to short-term, small-range
phase variations. The variations exhibit energetic harmonic content at
the harmonics of the fundamental power supply frequency. Hence a
realistic model for $\{\phi_k\}$ is

$$ \phi_k = (\phi_0 + 2\pi Bk) + \sum_{\ell} A_\ell \sin(2\pi\nu_\ell k\Delta + p_\ell), \quad k \in N^+ \qquad (4) $$

where $\nu_\ell = \ell \cdot 50$ Hz or $\nu_\ell = \ell \cdot 60$ Hz, depending on the place of use.

A typical phase process is depicted in Figure 3. The first term in
parentheses in (4) is the so-called frequency drift term and the
summation term is the phase jitter. In practice the constants $\phi_0$, $B$,
$\{A_\ell, \nu_\ell, p_\ell\}$ vary with time $k\Delta$, but at an extremely slow rate.

The spectrum of the phase jitter, i.e. the behavior of $A_\ell$ vs. $\nu_\ell$,
has been investigated experimentally in [14]. The spectrum is roughly
fitted by a $1/\nu^2$ curve. A phenomenological model for phase having a
$1/\nu^2$ spectrum (like that of phase jitter at high frequencies) is the
Wiener-Levy continuous time process,

$$ \frac{d}{dt}\phi(t) = w(t), \quad t > 0, \qquad (5) $$

where $\{w(t)\}$ is a white noise process. The discrete time analog is
the independent increments sequence

$$ \phi_k = \phi_{k-1} + w_k, \quad k \in N^+ \qquad (6) $$

where $\{w_k\}$ is a sequence of i.i.d. random variables with even² probability
density $h(w)$. When $w_k : N_{w_k}(0, \sigma_w^2)$, then $\{\phi_k\}$ is the
so-called normal random walk.
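
The measurement model (2), the noise model (3) and the normal random walk (6) can be combined into a small simulator. The sketch below is only a plausible illustration: the function names `wrap` and `simulate_measurements`, the `seed` argument, and the choice of a uniformly distributed starting phase are assumptions made here for the example.

```python
import cmath
import math
import random


def wrap(x):
    """Map a real angle onto the interval [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def simulate_measurements(symbols, sigma_n, sigma_w, seed=0):
    """Generate x_k = a_k exp(j phi_k) + n_k with a normal random-walk phase.

    sigma_n is the standard deviation of each real noise component,
    sigma_w is the standard deviation of the phase increments w_k.
    Returns the measurements and the (wrapped) phases that produced them.
    """
    rng = random.Random(seed)
    phi = rng.uniform(-math.pi, math.pi)  # assumed: no prior phase knowledge
    xs, phis = [], []
    for a in symbols:
        noise = complex(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
        xs.append(a * cmath.exp(1j * phi) + noise)
        phis.append(phi)
        phi = wrap(phi + rng.gauss(0.0, sigma_w))  # increment of (6), kept on the circle
    return xs, phis
```

Setting `sigma_w` to zero freezes the phase, and setting the symbols to zero leaves only the complex noise, whose mean squared magnitude should be close to $2\sigma_n^2$.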

In detail the model of (6) falls well short of a reputable
probabilistic model for phase, because at low frequencies the spectrum is
unbounded. Furthermore the spectrum is not integrable, corresponding
to the unbounded growth of the variance in the diffusion model of (6).
However, in gross terms, i.e. for short-term fluctuations, the model
captures, with appropriate selection of $h(w)$, the correlated evolution
of phase. The main virtue of the independent increments model is that
it forms a convenient basis from which to derive estimator structures
which may then be evaluated against more realistic phase sequences.

²That is, $h(w) = h(-w)$.

As the measurement model of (2) is invariant to modulo-$2\pi$ translates
of $\phi_k$, we may represent phase as if it were a random sequence on the unit
circle $C$ or equivalently on the interval $[-\pi, \pi)$. Call this
representation $\bar{\phi}_k$. Note $\bar{\phi}_{k+1}$ may be written

$$ \bar{\phi}_{k+1} = \bar{\phi}_k \oplus \bar{w}_{k+1} \qquad (7) $$

where $\oplus$ denotes modulo-$2\pi$ addition of real variables or equivalently
rotation with positive (counter-clockwise) sense on $C$. The variable
$\bar{w}_{k+1}$ is a modulo-$2\pi$ version of $w_{k+1}$.


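The conditional density of $\phi_{k+1} = \phi_k + w_{k+1}$, given $\phi_k$, is
$h(\phi_{k+1} - \phi_k)$. Since $\bar{\phi}_{k+1}$ is a modulo-$2\pi$ version of
$\phi_{k+1}$, we may reflect all of the conditional probability mass into $C$
to obtain the transition (or conditional) probability density

$$ f(\bar{\phi}_{k+1}/\bar{\phi}_k) = \sum_{\ell=-\infty}^{\infty} h(\bar{\phi}_{k+1} - \bar{\phi}_k - \ell 2\pi) = g_1(\bar{\phi}_{k+1} - \bar{\phi}_k) \qquad (8) $$

where $g_1(\cdot)$ is the function defined in (1) with $M = 1$. Hereafter
$g_1(\cdot)$ is called the folded density of the phase increments. Usually
the phase increment is small and its distribution $h(\cdot)$ is very narrow
with respect to $2\pi$.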

Therefore, in the sum of (8) only one term is relevant and
$f(\bar{\phi}_{k+1}/\bar{\phi}_k) \approx h(\bar{\phi}_{k+1} - \bar{\phi}_k)$.
In the normal case this implies $\sigma_w \ll 2\pi$, where $\sigma_w^2$ is the
variance of $w_k$. As it is cumbersome to carry around the overbar notation
we drop it with the caution that from here on $\phi_k$ is defined on
$C$ unless otherwise stated.

In the normal case [7], [8], the density $g_1$ may be written

$$ g_1(\phi_{k+1} - \phi_k) = \sum_{\ell=-\infty}^{\infty} N_{\phi_{k+1}}(\phi_k + \ell 2\pi, \sigma_w^2) \qquad (9) $$

This case and the Cauchy case (in which the distribution tails are much
heavier than the normal tails) are studied in Appendix A. It is shown
that $g_1(x)$ achieves its maximum at $x = 0$ and that it is monotone
decreasing on $0 \le x \le \pi$.

The sequence $\{\phi_k\}$ is Markov. Therefore, we may write for the
joint density of the $K$ phases $\{\phi_k\}_1^K$

$$ f(\{\phi_k\}_1^K) = \prod_{k=1}^{K} f(\phi_k/\phi_{k-1}) \qquad (10) $$

$$ f(\phi_1/\phi_0) \triangleq f(\phi_1): \ \text{the marginal density of } \phi_1 $$

Usually $\phi_1$ is uniformly distributed on $C$, because phase acquisition
starts at $k = 1$ with no prior information about its value. By the
independence of the $n_k$ in (2) it follows that the conditional density
of the measurement sequence $\{x_k\}_1^K$, given the phase and data sequences
$\{\phi_k\}_1^K$, $\{a_k\}_1^K$, is

$$ f(\{x_k\}_1^K / \{\phi_k\}_1^K, \{a_k\}_1^K) = \prod_{k=1}^{K} N_{x_k}(a_k e^{j\phi_k}, \sigma_n^2) . \qquad (11) $$

Equations (8)-(11) form the basis for the derivation of a MAP
sequence estimator. The key element is that $\{\phi_k\}$ is a Markov sequence
with a bounded range space $[-\pi, \pi)$. Discretization of this bounded
interval leads to a finite-state model from which a finite dimensional
dynamic programming algorithm can be derived.
III. Decision-Directed Algorithms

The usual way of dealing with phase fluctuations is to design a
phase estimator and use the estimated phase, call it $\hat{\phi}_k$, to rotate
the received signal as follows:

$$ y_k = x_k e^{-j\hat{\phi}_k}, \quad k \in N^+ . \qquad (12) $$

The phase corrected signal $y_k$ is then fed to a decision device
which, in turn, delivers the symbol estimate $\hat{a}_k$. Typically the phase
estimate $\hat{\phi}_k$ is functionally dependent on the old measurements
$\{\ldots, x_{k-2}, x_{k-1}\}$ and the past symbol estimates
$\{\ldots, \hat{a}_{k-2}, \hat{a}_{k-1}\}$. If a carrier
or pilot tone is transmitted as in single sideband (SSB) systems, then
$\hat{\phi}_k$ is obtained from a simple phase-locked loop (PLL). In suppressed
carrier systems such as PSK or QASK systems the PLL is "decision-directed".
That is, $\hat{\phi}_k$ is updated on the basis of $\hat{a}_{k-1}$. For instance in [5]

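$$ \hat{\phi}_{k+1} = \hat{\phi}_k + \mu\, \mathrm{Im}[x_k \hat{a}_k^* e^{-j\hat{\phi}_k}] \cong \hat{\phi}_k + \mu_k \sin(\arg x_k - \arg \hat{a}_k - \hat{\phi}_k), \quad \mu_k = \mu |x_k|^2 \qquad (13) $$

where * denotes complex conjugate and $\mu$ is a constant that depends on
the signal-to-noise ratio. The estimator of (13) is called a DDPLL.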
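
A first-order decision-directed loop of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the loop of [5]: it uses the sine form of the update with a constant gain `mu` in place of the measurement-dependent gain, a minimum-distance decision device, and hypothetical helper names (`wrap`, `nearest_symbol`, `ddpll`).

```python
import cmath
import math


def wrap(x):
    """Map a real angle onto the interval [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def nearest_symbol(y, alphabet):
    """Minimum-distance decision device (assumed here)."""
    return min(alphabet, key=lambda a: abs(y - a))


def ddpll(xs, alphabet, mu, phi0=0.0):
    """Decision-directed phase tracking with a constant loop gain mu (assumption)."""
    phi_hat = phi0
    decisions, phases = [], []
    for x in xs:
        y = x * cmath.exp(-1j * phi_hat)      # phase correction as in (12)
        a_hat = nearest_symbol(y, alphabet)   # symbol decision on the corrected sample
        decisions.append(a_hat)
        phases.append(phi_hat)                # estimate that was used for this sample
        err = cmath.phase(x) - cmath.phase(a_hat) - phi_hat
        phi_hat = wrap(phi_hat + mu * math.sin(err))
    return decisions, phases
```

On a noiseless 4-ary PSK sequence with a constant phase offset smaller than half the angular symbol spacing, the loop estimate converges geometrically to the offset and all decisions are correct.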

In the jitter equalizer (JE) of [3] and [4], $x_k$ is rotated and
scaled as follows:

$$ y_k = x_k G_k $$

$$ G_k = G_{k-1} + [\text{decision-directed correction term; illegible in the source}] \qquad (14) $$

The complex gain $G_k$ is the single complex coefficient of a
one-coefficient rapidly-adaptive equalizer. We may think of $G_k/|G_k|$ as the
phase correction $e^{-j\hat{\phi}_k}$, and $|G_k|$ as a gain correction $c_k$.
Thus, although there is no explicit formulation of a phase-gain estimation
problem in [3] and [4], the net effect of the JE is to correct phase
and normalize rapid fading variations. As explained in [4], when phase
fluctuations are large, the JE performance may be improved by setting a
constraint on $G_k$ that keeps its value inside a given domain including the
complex point (0,1).


Geometrical  Comments: 

The combined effects of random phase fluctuations and additive noise
may be illustrated as in Figure 4(a). The transmitted symbol $a_k$
is rotated by the random phase angle $\phi_k$ to give $a_k e^{j\phi_k}$. To this
is added the complex noise sample $n_k$ to give the measurement $x_k$ defined
in (2). For the case illustrated, the resultant measurement is closer
to a neighboring symbol than to the transmitted symbol and consequently,
with no phase or phase-gain correction, a decoding error would be made.
To emphasize the combined effects of phase fluctuation and additive noise,
we have illustrated a case for which either phase jitter or additive noise
alone would cause no error. See [11] for a probabilistic discussion of this
issue. Figure 4(b) is an illustration of how a DDPLL works. The angle shown
is the noisy measured phase ($\arg x_k$) minus the sum of the phase of the
decoded symbol and the previously estimated phase
($\arg \hat{a}_k + \hat{\phi}_k$). A given amount of this angle is added to
$\hat{\phi}_k$ as a correction to get the new phase estimate
$\hat{\phi}_{k+1}$. Note that only phase is corrected. In
the JE both phase and gain are corrected, offering potential for improved
performance. This potential is particularly important in QASK symbol
sets where amplitude errors in $x_k$ can result in decoding errors.
IV. MAP Phase and Symbol Sequence Decoding with the Viterbi Algorithm

The basic idea behind MAP sequence decoding is to find a sequence
$\{\phi_k, a_k\}_1^K$ of phase-symbol pairs that, based on the observation
sequence $\{x_k\}_1^K$, appears most likely. The application of this idea to
data communication was first proposed in [1] and refined in [9]. The most
likely sequence, call it $\{\hat{\phi}_k, \hat{a}_k\}_1^K$, is the sequence
that maximizes the natural logarithm (or any other monotone increasing
function) of the a posteriori density of $\{\phi_k, a_k\}_1^K$, given the
sequence of observations $\{x_k\}_1^K$. Thus we pose the maximization problem:

$$ \max_{\{\phi_k\}_1^K, \{a_k\}_1^K} \ln f(\{\phi_k\}_1^K, \{a_k\}_1^K / \{x_k\}_1^K) . \qquad (15) $$


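This is equivalent to maximizing the natural logarithm of the likelihood
function $f(\{x_k\}_1^K, \{\phi_k\}_1^K, \{a_k\}_1^K)$, obtained by evaluating
the joint density function for $\{x_k\}_1^K$, $\{\phi_k\}_1^K$, and
$\{a_k\}_1^K$ at the observed values of $\{x_k\}_1^K$. Using the results of
(10) and (11) we may write:

$$ f(\{x_k\}_1^K, \{\phi_k\}_1^K, \{a_k\}_1^K) = \left[ \prod_{k=1}^{K} N_{x_k}(a_k e^{j\phi_k}, \sigma_n^2)\, f(\phi_k/\phi_{k-1}) \right] f(\{a_k\}_1^K) . \qquad (16) $$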


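Assuming the $\{a_k\}_1^K$ to be a sequence of independent, equally likely
symbols, using (8), and neglecting uninteresting constants, we may
write the maximization problem as

$$ \max_{\{\phi_k\}_1^K, \{a_k\}_1^K} \Gamma_K, \qquad \Gamma_K = -\frac{1}{2\sigma_n^2} \sum_{k=1}^{K} |x_k - a_k e^{j\phi_k}|^2 + \sum_{k=2}^{K} \ln g_1(\phi_k - \phi_{k-1}) + \ln f(\phi_1) . \qquad (17) $$

Note that $\Gamma_K$ satisfies the recursion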

$$ \Gamma_k = \Gamma_{k-1} + p_k, \quad k = 2, 3, \ldots $$

$$ p_k = -\frac{1}{2\sigma_n^2} |x_k - a_k e^{j\phi_k}|^2 + \ln g_1(\phi_k - \phi_{k-1}), \quad k = 2, 3, \ldots \qquad (18) $$

$$ \Gamma_1 = -\frac{1}{2\sigma_n^2} |x_1 - a_1 e^{j\phi_1}|^2 + \ln f(\phi_1) $$

where $p_k$ is the so-called path-metric. For convenience, let us make
explicit in $\Gamma_K$ the last phase and symbol: $\Gamma_K(\phi_K, a_K)$. The
other arguments $\{\phi_k\}_1^{K-1}$, $\{a_k\}_1^{K-1}$ remain implicit. Then,
from (18)

$$ \Gamma_K(\phi_K, a_K) = \Gamma_{K-1}(\phi_{K-1}, a_{K-1}) + p_K(\phi_K, a_K, \phi_{K-1}) . \qquad (19) $$

Thus, the maximizing sequence, call it $\{\hat{\phi}_k, \hat{a}_k\}_1^K$,
passing through $(\phi_{K-1}, a_{K-1})$ on its way to $(\phi_K, a_K)$, must
arrive at $(\phi_{K-1}, a_{K-1})$ along a route $\{\phi_k, a_k\}_1^{K-2}$ that
maximizes $\Gamma_{K-1}(\phi_{K-1}, a_{K-1})$. It is this observation which
forms the basis of forward dynamic programming. In the actual implementation
of a dynamic programming algorithm, one must discretize the phase space $C$
to a finite dimensional grid of phase values $\Xi = \{\xi_n\}_{n=1}^{m}$. The
function $\ln g_1(\phi_k - \phi_{k-1})$ is then defined on the
two-dimensional grid $\Xi \times \Xi$. However, as discussed in [8] and [9] the
resulting $m \times m$ matrix of conditional probabilities has Toeplitz
symmetry, which means only an $m$-vector of conditional probabilities must
be computed and stored.

The Viterbi algorithm for simultaneous phase and symbol decoding
consists simply of an algorithm which determines survivor phase-symbol
sequences terminating at each possible phase-symbol pair. One of these
surviving sequences is ultimately decoded as the approximate MAP
phase-symbol sequence. The complexity $c$ of the algorithm lies mainly in the
evaluation of the $mM$ possible values of $|x_k - a_k e^{j\phi_k}|^2$ for each
new measurement $x_k$. Here $M$ is the symboling alphabet size and $m$ is the
number of discrete phase values. For each calculation of
$|x_k - a_k e^{j\phi_k}|^2$ there are 6 real multiplies. Compared to this
multiplication load of $6mM$ per sample, the determination and addition of
the $m$ possible values of $\ln g_1(\phi_k - \phi_{k-1})$ that appear in (18)
is negligible. The determination of $|x_k - a_k e^{j\phi_k}|^2$ would likely
be computed in a pipe-lined parallel architecture, while the terms
$\ln g_1(\cdot)$ would be read by appropriately addressing ROM. When there
are many symbols and short-term phase fluctuations have small amplitude
($\sigma_w$ small), so that $m$ must be large for accurate phase tracking,
then the complexity is great. For example with $M = 8$ and $m = 48$,
$c \propto mM = 384$, indicating on the order of $2 \times 10^3$ computations
at each k-step.
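
The multiplication count quoted above can be checked with a short piece of arithmetic. The split into one complex product followed by a squared magnitude is one plausible accounting of the 6 real multiplies, offered here as an assumption rather than as the authors' own breakdown.

```latex
\begin{aligned}
a_k e^{j\phi_k} &: \ 4 \text{ real multiplies (one complex product)} \\
|x_k - a_k e^{j\phi_k}|^2 = (\mathrm{Re})^2 + (\mathrm{Im})^2 &: \ 2 \text{ real multiplies} \\
6\,mM = 6 \times 48 \times 8 &= 2304 \approx 2 \times 10^3
\end{aligned}
```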


As we show in the next section the complexity of the Viterbi algorithm
can be dramatically reduced by making a change of variable and tracking a
total phase variable that is the sum of $\phi_k$ and the symbol phase,
$\arg a_k$. And, of course, for PSK symbol sets $M$ may be set to unity
because only one symbol amplitude is admissible and admissible symbol
phases may be chosen to fall on one of the discrete phase values. Thus for
PSK symbol sets the complexity is simply $m$ and the number of path metric
computations is on the order of 300 for $m = 48$. Even this figure may be
reduced by using one of a variety of so-called M-algorithms in which all
survivor states are saved but only a handful of candidate originator states
are considered for each survivor.
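
The forward recursion of Section IV can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: the phase grid is uniform, $g_1$ is a folded normal with additive constants dropped, the prior on the first phase is uniform, the whole sequence is traced back at the end (no fixed-lag decoding), and all function names are hypothetical. Because the symbols are independent and the path metric does not involve the previous symbol, the sketch keeps only the best symbol at each phase node, which does not change the maximizing sequence; it still evaluates all $mM$ squared distances per sample.

```python
import cmath
import math


def log_folded_normal_table(m, sigma_w, n_wraps=5):
    """Unnormalised ln g_1 at the grid differences d * 2*pi/m, d = 0..m-1.

    Only m values are needed because the transition matrix is Toeplitz.
    """
    table = []
    for d in range(m):
        u = d * 2.0 * math.pi / m
        g = sum(math.exp(-((u - l * 2.0 * math.pi) ** 2) / (2.0 * sigma_w ** 2))
                for l in range(-n_wraps, n_wraps + 1))
        table.append(math.log(g + 1e-300))  # guard against log(0) after underflow
    return table


def viterbi_phase_symbol(xs, alphabet, m, sigma_n, sigma_w):
    """Approximate MAP phase-symbol sequence on an m-point phase grid."""
    if not xs:
        return [], []
    grid = [-math.pi + n * 2.0 * math.pi / m for n in range(m)]
    log_g = log_folded_normal_table(m, sigma_w)
    score, back, best_sym = None, [], []
    for x in xs:
        local, syms = [], []
        for xi in grid:
            rot = cmath.exp(1j * xi)
            d2 = [abs(x - a * rot) ** 2 for a in alphabet]  # M distances per node
            s = min(range(len(alphabet)), key=d2.__getitem__)
            local.append(-d2[s] / (2.0 * sigma_n ** 2))
            syms.append(s)
        if score is None:
            new_score, prev = local, [-1] * m  # uniform prior on the first phase
        else:
            new_score, prev = [], []
            for n in range(m):
                p = max(range(m), key=lambda q: score[q] + log_g[(n - q) % m])
                new_score.append(score[p] + log_g[(n - p) % m] + local[n])
                prev.append(p)
        score = new_score
        back.append(prev)
        best_sym.append(syms)
    n = max(range(m), key=score.__getitem__)
    phases, symbols = [], []
    for k in range(len(xs) - 1, -1, -1):  # trace the surviving path backwards
        phases.append(grid[n])
        symbols.append(alphabet[best_sym[k][n]])
        n = back[k][n]
    phases.reverse()
    symbols.reverse()
    return phases, symbols
```

Note that for a rotationally symmetric symbol set (PSK, QASK) and a uniform prior on the first phase, the joint estimate is only defined up to the symmetry of the set; the example is best exercised with an asymmetric alphabet or with differentially encoded data.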
V. A Principle of Optimality for Phase-Amplitude Coded Symbols
and an Efficient Two-Step Decoding Procedure

In order to simplify matters and to illustrate the key ideas, let
us consider PSK symbols of the form

$$ a_k = e^{j\theta_k} \qquad (20) $$

with $\{\theta_k\}$ drawn independently from an M-ary equi-probable alphabet
$\Theta = \{(i-1)2\pi/M\}_{i=1}^{M}$. Write the measurement model of (2) as

$$ x_k = e^{j\psi_k} + n_k \qquad (21) $$

where the total phase $\psi_k$ is represented as follows:

$$ \psi_k = \phi_k + \theta_k, \qquad \theta_k = \sum_{i=1}^{k} \Delta\theta_i, \qquad \Delta\theta_k = \theta_k - \theta_{k-1}, \qquad \Delta\theta_1 = \theta_1 . \qquad (22) $$


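It is clear that $\theta_k = \sum_{i=1}^{k} \Delta\theta_i$ and
$\phi_k = \psi_k - \theta_k$. Thus we may replace the MAP sequence estimation
problem posed in (15) by the problem

$$ \max_{\{\psi_k\}_1^K, \{\Delta\theta_k\}_1^K} f^K, \qquad f^K \triangleq f(\{x_k\}_1^K, \{\psi_k\}_1^K, \{\Delta\theta_k\}_1^K) . \qquad (23) $$

The joint density $f^K = f(\cdot, \cdot, \cdot)$ in (23) may be written

$$ f^K = \prod_{k=1}^{K} N_{x_k}(e^{j\psi_k}, \sigma_n^2)\, f(\psi_k, \Delta\theta_k / \{\psi_j\}_1^{k-1}, \{\Delta\theta_j\}_1^{k-1}) \qquad (24) $$

where for $k = 1$, $f(\psi_1, \Delta\theta_1 / \cdot, \cdot)$ is simply the
marginal density $f(\psi_1, \Delta\theta_1)$. The conditional density on the
right hand side of (24) is easily evaluated with Bayes' rule:

$$ f(\psi_k, \Delta\theta_k / \{\psi_j\}_1^{k-1}, \{\Delta\theta_j\}_1^{k-1}) = f(\psi_k / \Delta\theta_k, \{\psi_j\}_1^{k-1}, \{\Delta\theta_j\}_1^{k-1})\, f(\Delta\theta_k / \{\psi_j\}_1^{k-1}, \{\Delta\theta_j\}_1^{k-1}) . \qquad (25) $$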


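Now $\Delta\theta_k$ is independent of the previous data, additive noise and
phase fluctuations. Thus

$$ f(\Delta\theta_k / \{\psi_j\}_1^{k-1}, \{\Delta\theta_j\}_1^{k-1}) = \frac{1}{M} . \qquad (26) $$

Moreover if we rewrite $\psi_k$ as

$$ \psi_k = \phi_{k-1} + w_k + \theta_{k-1} + \theta_k - \theta_{k-1} = \psi_{k-1} + \Delta\theta_k + w_k \qquad (27) $$

we see immediately that

$$ f(\psi_k / \Delta\theta_k, \{\psi_j\}_1^{k-1}, \{\Delta\theta_j\}_1^{k-1}) = g_1(\psi_k - \psi_{k-1} - \Delta\theta_k) . \qquad (28) $$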

Recall $\psi_k$ is defined on the circle $C$. Therefore, for clarity we might
think of $\psi_k$ as a random variable $\psi_{k-1} + \Delta\theta_k + w_k$,
whose density is folded in $[-\pi, \pi)$. Putting (24)-(28) together, we have
for the joint density $f^K$

$$ f^K = \prod_{k=1}^{K} N_{x_k}(e^{j\psi_k}, \sigma_n^2)\, \frac{1}{M}\, g_1(\psi_k - \psi_{k-1} - \Delta\theta_k) . \qquad (29) $$


Principle of Optimality: Call $\{\hat{\psi}_k\}_1^K$,
$\{\Delta\hat{\theta}_k\}_1^K$ the MAP sequences that maximize $f^K$;
$\{\Delta\theta_k\}_1^K$ enters only in the $g_1(\cdot)$ term on the right hand
side of (29). Now let us suppose (as is usual) that $g_1(w)$, which is
even, is also unimodal with a peak at $w = 0$. This single-mode assumption
for $g_1(\cdot)$ is valid in particular when the phase increment $w_k$ in
the Markov process (6) has a Gaussian or Cauchy distribution $h(w)$.
See Appendix A. It follows that $f^K$ is maximized by choosing

$$ \Delta\hat{\theta}_k = [\hat{\psi}_k - \hat{\psi}_{k-1}] \qquad (30) $$

where $[x]$ denotes the closest value of $(\ell-1)2\pi/M$ to $x$. By
substitution of the constraint (30) into (29) and defining the "rest" function
$R(x)$ on the circle $C$ by

$$ R(x) = x - [x] \qquad (31) $$

we find that

$$ \hat{f}^K = \prod_{k=1}^{K} N_{x_k}(e^{j\psi_k}, \sigma_n^2)\, \frac{1}{M}\, g_1(R(\psi_k - \psi_{k-1})) . \qquad (32) $$
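
The quantizer $[x]$ and the rest function $R(x)$ of (30) and (31) are simple to realize. The following Python sketch is only an illustration; the names `nearest_symbol_phase`, `rest` and `differential_decision`, and the convention of returning $[x]$ in $[0, 2\pi)$, are assumptions made for the example.

```python
import math


def wrap(x):
    """Map a real angle onto the interval [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def nearest_symbol_phase(x, M):
    """[x]: the admissible phase (l-1)*2*pi/M, l = 1..M, closest to x on the circle."""
    step = 2.0 * math.pi / M
    return (round(x / step) % M) * step


def rest(x, M):
    """R(x) = x - [x], expressed on the circle; its magnitude never exceeds pi/M."""
    return wrap(x - nearest_symbol_phase(x, M))


def differential_decision(psi_k, psi_prev, M):
    """Decision rule (30): quantize the total-phase increment to an admissible phase."""
    return nearest_symbol_phase(psi_k - psi_prev, M)
```

For 4-ary PSK, for example, a total-phase increment of $\pi/2 + 0.03$ is decoded as $\pi/2$ and leaves a rest of $0.03$ to be attributed to the phase fluctuation.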

The maximization of $\hat{f}^K$ with respect to $\{\psi_k\}_1^K$ is formally
equivalent to maximizing the joint density
$f(\{x_k\}_1^K, \{\psi_k\}_1^K)$, when the total phase $\psi_k$ follows a
Markov model similar to (6):

$$ \psi_k = \psi_{k-1} + u_k . \qquad (33) $$

Here the independent increments $u_k$ have "probability density", folded
on the circle $C$,

$$ f(u) = \frac{1}{M}\, g_1(R(u)) . \qquad (34) $$

This interpretation is purely formal since $f(u)$ is not generally
a probability density. However when

$$ g_1(u) = 0, \quad |u| \ge \frac{\pi}{M} \qquad (35) $$

then $f(u)$ is a probability density because in that case

$$ \frac{1}{M}\, g_1(R(u)) = g_M(u) . \qquad (36) $$

Thus (34) can be interpreted as an approximate density when
the peak of $g_1(u)$ is narrower than the minimum phase distance between
the symbols. This condition is always satisfied in communications
applications. Otherwise phase distortion is so large that data
transmission is not possible. Thus we have a pure phase-tracking
problem as in [8] and [9] and we may proceed accordingly. Taking the
natural logarithm of $\hat{f}^K$ we have the maximization problem:
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max  Ty 
U  }K 


1  9 

Tu  +  K  ■ r;  ■  -77  've  1  + h(tt(*i» 

za 

n 


72  !ve  kl2  +  ln  gM'W1 

2a 

n 


which is solved by the dynamic programming algorithm discussed in Section IV. The complexity $C$ of this algorithm lies essentially in the evaluation of the $m$ possible values of $|x_k - e^{j\psi_k}|^2$ for each new data value $x_k$. The $m$ different values of $\ln g_1[R(\cdot)]$ will be pre-computed and stored in ROM. For each computation of $|x_k - e^{j\psi_k}|^2$ there are 6 multiplies, so complexity is simply proportional to $m$. This represents a reduction in complexity proportional to $M$ for $M$-ary PSK.
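The two-step procedure can be sketched in a few lines. The code below is a minimal illustration and not the program used for the simulations: it assumes a Gaussian phase increment, a phase grid of $m$ points with $m$ a multiple of $M$, a uniform prior on the initial phase, and full-length survivor storage (no fixed-lag truncation). All function and variable names are ours. Step one runs a Viterbi recursion on the total phase with a pre-computed table of $\ln g_1[R(\cdot)]$; step two reads the symbols off the increments of the decoded total phase.

```python
import numpy as np

def rest(x, M):
    """Rest function R(x) = x - [x], with [x] the nearest multiple of 2*pi/M."""
    step = 2.0 * np.pi / M
    return x - step * np.round(x / step)

def folded_normal(u, sigma_w, n_wraps=2):
    """Normal density with variance sigma_w**2, folded onto the circle."""
    k = np.arange(-n_wraps, n_wraps + 1)
    z = (u[..., None] + 2.0 * np.pi * k) / sigma_w
    return np.exp(-0.5 * z**2).sum(axis=-1) / (sigma_w * np.sqrt(2.0 * np.pi))

def map_phase_symbol_decode(x, m, M, sigma_w, sigma_n):
    """Return (decoded total phase, decoded differential symbol indices)."""
    if m % M:
        raise ValueError("m must be a multiple of M")
    grid = -np.pi + 2.0 * np.pi * np.arange(m) / m
    # Table of ln g1[R(psi_i - psi_j)]: row i = current state, column j = previous.
    diff = grid[:, None] - grid[None, :]
    table = np.log(folded_normal(rest(diff, M), sigma_w) + 1e-300)

    def branch(xk):
        # Measurement term -|x_k - exp(j psi)|^2 / (2 sigma_n^2) for all m states.
        return -np.abs(xk - np.exp(1j * grid)) ** 2 / (2.0 * sigma_n**2)

    K = len(x)
    metric = branch(x[0])
    back = np.zeros((K, m), dtype=int)
    rows = np.arange(m)
    for k in range(1, K):
        cand = table + metric[None, :]
        back[k] = cand.argmax(axis=1)          # best predecessor of each state
        metric = cand[rows, back[k]] + branch(x[k])
    idx = np.empty(K, dtype=int)
    idx[-1] = metric.argmax()
    for k in range(K - 1, 0, -1):              # trace the surviving path back
        idx[k - 1] = back[k, idx[k]]
    psi_hat = grid[idx]
    # Step two: the symbol is the bracket [psi_k - psi_{k-1}], as an index mod M.
    step = 2.0 * np.pi / M
    symbols = np.round(np.diff(psi_hat) / step).astype(int) % M
    return psi_hat, symbols
```

Because the table depends only on the difference of grid indices, a hardware realization would store $m$ values rather than the full $m \times m$ array built here for clarity.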


Usually, the phase is differentially modulated rather than directly modulated and therefore the relevant symbol is $\Delta\hat\theta_k$ itself (see (30)). For the purpose of data transmission, there is no need to reconstruct the absolute data phase $\hat\theta_k = \sum_{i=1}^{k} \Delta\hat\theta_i$. This reconstruction has, however, been carried out in the simulations in order to recover the estimates $\hat\phi_k = \hat\psi_k - \hat\theta_k$ of the phase fluctuations, and to get the approximate variance of the phase estimates


'  MJ' 


Geometrical Comments and Densities Galore: The entire development of this section has a nice geometric interpretation which we illustrate in Figure 5. In Figure 5(a) the basic phase noise density $h(x)$ is illustrated on $(-\infty, \infty)$. Figure 5(b) is the folded version $g_1(x)$ of $h(x)$ to account for the wrapping on the unit circle $C$. Figure 5(c) is the function $g_1[R(x)]$ that arises in our discussion of the principle of optimality, sketched in the case of 4-ary phase modulation. Figure 5(d) shows $g_1[R(x)]$ wrapped around the circle $C$. Since $g_1(x)$ is very narrow, $g_1[R(x)]$ is approximately the repeated copy of $g_1(x)$ at all possible values of data phase. With $x = \psi_k - \psi_{k-1}$, Figure 5(d) illustrates the choice of $\Delta\theta_k$ nearest $x$ ($\Delta\theta_k = \pi/2$ is the best choice here), and the resulting value of $g_1[R(\psi_k - \psi_{k-1})]$ is shown by the heavy segment on the axis terminated by the heavy dot.

We now extend this principle of optimality to phase-amplitude encoded symbols. Assume the independent, equally probable data symbols are complex symbols of the form

$$A_k e^{j\theta_k} \tag{40}$$

with the $A_k$ positive real numbers drawn independently from a finite alphabet $A = (a_1, a_2, \ldots)$. Denote by $p(A_k)$ the probability mass function for the random variable $A_k$. Assume the $\theta_k$ are drawn from a finite alphabet $B = (\beta_1, \beta_2, \ldots)$. Denote the conditional probability mass function of $\theta_k$, given $A_k$, by $p(\theta_k \mid A_k)$. For the (4,4) diagram of Figure 2, $A = (\sqrt{2}a, 3a, 3\sqrt{2}a, 5a)$; $B = \{\beta_i\}_1^8$, $\beta_i = (i-1)\pi/4$. The probabilistic description of the source is

$$\begin{aligned}
p(A_k) &= 1/4 \quad \text{for all } A_k \\
p(\theta_k \mid A_k = a_1) &= \begin{cases} 1/4, & \theta_k = \beta_2, \beta_4, \beta_6, \beta_8 \\ 0, & \text{otherwise} \end{cases} \\
p(\theta_k \mid A_k = a_3) &= p(\theta_k \mid A_k = a_1) \\
p(\theta_k \mid A_k = a_2) &= \begin{cases} 1/4, & \theta_k = \beta_1, \beta_3, \beta_5, \beta_7 \\ 0, & \text{otherwise} \end{cases} \\
p(\theta_k \mid A_k = a_4) &= p(\theta_k \mid A_k = a_2)
\end{aligned} \tag{41}$$
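In place of the maximization problem posed in (23) we write

$$\max_{\{\Delta\theta_k\}_1^K,\ \{\psi_k\}_1^K,\ \{A_k\}_1^K} f(\{x_k\}_1^K, \{\psi_k\}_1^K, \{\Delta\theta_k\}_1^K, \{A_k\}_1^K) \tag{42}$$

with $\psi_k$ and $\Delta\theta_k$ defined as in (22). The density $f(\cdot,\cdot,\cdot,\cdot)$ appearing in (42) may be written

$$f = \prod_{k=1}^{K} N_{x_k}(A_k e^{j\psi_k}, \sigma_n^2)\, f(\psi_k, \Delta\theta_k, A_k \mid \{\psi_i\}_1^{k-1}, \{\Delta\theta_i\}_1^{k-1}, \{A_i\}_1^{k-1}) \tag{43}$$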

The conditional density on the right-hand side of (43) is simply

$$f(\psi_k, \Delta\theta_k, A_k \mid \cdot,\cdot,\cdot) = g_1(\psi_k - \psi_{k-1} - \Delta\theta_k)\, p(\Delta\theta_k \mid A_k, A_{k-1})\, p(A_k) \tag{44}$$

where $p(\Delta\theta_k \mid A_k, A_{k-1})$ is the conditional probability mass function for $\Delta\theta_k$, given $A_k$ and $A_{k-1}$. Putting (43) and (44) together we have as the joint density function to be maximized

$$f = \prod_{k=1}^{K} N_{x_k}(A_k e^{j\psi_k}, \sigma_n^2)\, g_1(\psi_k - \psi_{k-1} - \Delta\theta_k)\, p(\Delta\theta_k \mid A_k, A_{k-1})\, p(A_k) \tag{45}$$

It is important to note in this expression that the $N_{x_k}(\cdot,\cdot)$ term is dependent only on the measurement model; $g_1(\cdot)$ is dependent only on the random phase model, and $p(\Delta\theta_k \mid \cdot,\cdot)\, p(A_k)$ is dependent only upon the symboling constellation (or encoding scheme). Thus (45) is a useful canonical decomposition that is generally applicable to communications problems involving additive independent noise and independent increments phase processes.

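For the (4,4) diagram of Figure 2 we may compute $p(\Delta\theta_k \mid A_k, A_{k-1})$ as follows:

$$p(\Delta\theta_k \mid A_k = a_i, A_{k-1} = a_j) = \begin{cases} 1/4, & \Delta\theta_k = \beta_1, \beta_3, \beta_5, \beta_7;\ i, j \text{ even-even or odd-odd} \\ 1/4, & \Delta\theta_k = \beta_2, \beta_4, \beta_6, \beta_8;\ i, j \text{ even-odd or odd-even} \end{cases} \tag{46}$$

It is a straightforward matter to substitute these results into (45) and derive a path-metric as in (37).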
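The entries of (46) can be checked by direct enumeration. The sketch below assumes, as in (41), that amplitudes $a_1, a_3$ carry the phases $\beta_2, \beta_4, \beta_6, \beta_8$ and amplitudes $a_2, a_4$ carry $\beta_1, \beta_3, \beta_5, \beta_7$, with successive symbols independent; the dictionary and function names are illustrative only.

```python
from collections import Counter
from itertools import product

# Amplitude index -> indices i of the allowed phases beta_i = (i-1)*pi/4.
ALLOWED = {1: (2, 4, 6, 8), 3: (2, 4, 6, 8), 2: (1, 3, 5, 7), 4: (1, 3, 5, 7)}

def delta_theta_pmf(amp_k, amp_prev):
    """PMF of the phase difference, keyed by the index of beta, given both amplitudes."""
    counts = Counter()
    for i, j in product(ALLOWED[amp_k], ALLOWED[amp_prev]):
        counts[(i - j) % 8 + 1] += 1       # difference of phases, reduced mod 2*pi
    total = sum(counts.values())
    return {d: c / total for d, c in sorted(counts.items())}
```

Amplitude pairs of equal parity give mass 1/4 on each of $\beta_1, \beta_3, \beta_5, \beta_7$, and pairs of opposite parity give 1/4 on each of $\beta_2, \beta_4, \beta_6, \beta_8$.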


VI. Linear Performance Results and the Selection of a Fixed Lag

There is one more simplification to be made: namely the selection of a depth constant $k_0$ such that phase-symbol pairs may be decoded at a fixed lag $k_0$, thereby obviating the need to store long survivor sequences.

The idea is the following. Call $\{\hat\psi_{k/K}\}_1^K$ the MAP phase sequence based on measurements $\{x_k\}_1^K$. The subscript $k/K$ indicates that $\hat\psi_{k/K}$ depends on all measurements up to time $K$. In general the MAP sequence $\{\hat\psi_{k/K+1}\}_1^{K+1}$ based on measurements to time $(K+1)$ may differ from $\{\hat\psi_{k/K}\}_1^K$ at all values of $1 \le k \le K$. However, one expects that for large $K$ and for $k \le K - k_0$, the sequences $\{\hat\psi_{k/K}\}$ and $\{\hat\psi_{k/K+1}\}$ will not be very different for a well chosen depth $k_0$. In other words, long survivor sequences have one common trunk up to $K - k_0$, at which point they may diverge as illustrated in Figure 6. Thus we may use $\hat\psi_{K-k_0/K}$ as a final estimate of $\psi_{K-k_0}$ since $\hat\psi_{K-k_0/K+i} \approx \hat\psi_{K-k_0/K}$ for all $i$. Thus, as a practical matter, one may choose a depth constant $k_0$ such that the sequence of fixed-lag estimates $\hat\psi_{k-k_0/k}$, $k = k_0 + 1, k_0 + 2, \ldots$, gives an approximate MAP sequence. Here $\hat\psi_{k-k_0/k}$ is the phase value, $k_0$ samples back, in the MAP sequence based on measurements up to time $k$. In this way phase values are estimated with delay $k_0$ and only survivor sequences of length $k_0$ must be stored.

How to choose $k_0$? This is a difficult question to answer precisely because there exist no analytical results for the performance of nonlinear phase trackers of the Viterbi type. We can, however, study the filtering behavior of a related linear problem and find how performance varies with fixed lag $k_0$. To this end, we consider the problem of tracking phase when there is no data symboling. Assume $\{\psi_k\}$ is a normal random walk of the form (6) with $w_k : N(0, \sigma_w^2)$. Let $x_k = e^{j\psi_k} + n_k$, with $\{n_k\}$ a sequence of complex i.i.d. $N(0, \sigma_n^2)$ random variables. A PLL with gain $K_1$ for estimating $\psi_k$ is the following:

$$\hat\psi_k = \hat\psi_{k-1} + K_1 |x_k| \sin(\arg x_k - \hat\psi_{k-1}) \tag{47}$$


Note that this is similar to (13) when there is no data. For $\sigma_n^2 \ll 1$ we approximate (47) with

$$\hat\psi_k = \hat\psi_{k-1} + K_1(\arg x_k - \hat\psi_{k-1}) \tag{48}$$

When $K_1$ is selected to be

$$K_1 = (\sigma_w^2/\sigma_n^2)\,[-0.5 + 0.5\,(1 + 4\sigma_n^2/\sigma_w^2)^{1/2}] \tag{49}$$

then (48) is the Kalman filter for the "linear observation model"

$$\arg x_k = \psi_k + n_k \;\leftrightarrow\; x_k = \exp[j(\psi_k + n_k)] \tag{50}$$

The steady-state filtering error $P_0$ for this linear problem is related to $K_1$ as follows:

$$K_1 = \frac{\sigma_w^2}{\sigma_n^2} \cdot \frac{P_0}{\sigma_w^2}.$$


A general result due to Hedelin [12] for fixed-lag smoothing may be adapted to random walk smoothing from observations of the form (50). The steady-state fixed-lag smoothing variance $P_{k_0}$ at delay $k_0$ is

$$P_{k_0}/\sigma_w^2 = P_0/\sigma_w^2 - \sum_{i=1}^{k_0} G^{2i} = P_0/\sigma_w^2 - G^2(1 - G^{2k_0})/(1 - G^2), \qquad G = 1 - K_1.$$

The infinite-lag smoothing variance is

$$P_\infty/\sigma_w^2 = P_0/\sigma_w^2 - G^2/(1 - G^2).$$
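The gain and variance expressions of this section are easy to evaluate numerically. The sketch below is only an illustration with names of our choosing; it writes the ratio $\sigma_w^2/\sigma_n^2$ as `alpha` and returns all variances normalized by $\sigma_w^2$, using the relation $P_0/\sigma_w^2 = K_1\,\sigma_n^2/\sigma_w^2$.

```python
import math

def kalman_gain(alpha):
    """Steady-state gain K_1 of (49) for alpha = sigma_w^2 / sigma_n^2."""
    return alpha * (-0.5 + 0.5 * math.sqrt(1.0 + 4.0 / alpha))

def normalized_variances(alpha, k0):
    """Return (P_0, P_k0, P_inf), each divided by sigma_w^2."""
    k1 = kalman_gain(alpha)
    g2 = (1.0 - k1) ** 2                      # G^2 with G = 1 - K_1
    p0 = k1 / alpha                           # filtering variance
    pk = p0 - g2 * (1.0 - g2 ** k0) / (1.0 - g2)   # fixed lag k0
    pinf = p0 - g2 / (1.0 - g2)               # infinite lag
    return p0, pk, pinf
```

For equal phase and noise variances (`alpha = 1`) the gain is $(\sqrt{5}-1)/2 \approx 0.62$ and the infinite-lag variance is $\sigma_w^2/\sqrt{5}$.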


In Figure 7 several error expressions and asymptotic forms are plotted versus $\sigma_w^2/\sigma_n^2$, which is a kind of SNR. For large $\sigma_w^2/\sigma_n^2$, the error variances $P_0/\sigma_w^2$, $P_{10}/\sigma_w^2$ and $P_\infty/\sigma_w^2$ go as $(\sigma_w^2/\sigma_n^2)^{-1}$. For small $\sigma_w^2/\sigma_n^2$, they go as $(\sigma_w^2/\sigma_n^2)^{-1/2}$, although infinite-lag smoothing offers 6dB improvement in $\sigma_w^2/\sigma_n^2$ over zero-lag smoothing for a fixed smoothing variance. Over the range of values $0.01 \le \sigma_w^2/\sigma_n^2 \le 10$, a delay of $k_0 = 10$ offers all but 1 to 2dB of the theoretically achievable gain from infinite delay. In communication problems for which random phase is a significant effect, the ratio $\sigma_w^2/\sigma_n^2$ is typically in this range. Only at very small values of $\sigma_w^2/\sigma_n^2$ can very large delays $k_0$ provide large performance gains. But in this case there is no real phase fluctuation problem for the purpose of data decoding, and the gain is not worth the large delay. Shown also in Figure 7 is the Kalman gain $K_1$ versus $\sigma_w^2/\sigma_n^2$.

The problem considered in Section IV is admittedly different from the linear problem considered here. However, the numerical results given in Figure 8 for the Viterbi phase tracker illustrate that the performance gain to be achieved with a fixed lag of $k_0 = 10$ is much as predicted by the linear theory. Furthermore, over the range of values $0.1 \le \sigma_w^2/\sigma_n^2 \le 2$, the phase estimator variance for the Viterbi phase tracker operating with delay $k_0 = 10$ is essentially equivalent to the filtering variance of a Kalman filter that has access to linear observations and provides estimates without delay. Performance is not measurably degraded by the presence of data which is concurrently decoded. For the results of Figure 8, the phase space was discretized to $m = 48$ values. Data transmission was 8-ary PSK.
VII. Simulation Results: Gaussian Increments

For all simulation results discussed in this section the phase space $[-\pi, \pi)$ has been discretized to 48 equally-spaced phase values and a Viterbi algorithm has been programmed to solve the MAP sequence estimation problem. The principle of optimality established in Section V has been used to derive the appropriate path metric and thereby reduce computational complexity. The choice of a fixed-lag decoding (or depth) constant is $k_0 = 10$. Source symbols have been generated independently. The random phase sequence has been governed by the independent increments model of (2) with $w_k : N(0, \sigma_w^2)$ and initial phase uniformly distributed on $[-\pi, \pi)$. Initial phase acquisition has been achieved by transmitting a preamble according to one of the following schemes.

a)  During  a  pre-transmission  period  of  length  N,  the  sequence  of 
transmitted  data  is  known  to  the  receiver.  Thus,  in  the  DBVA  and  VA 
systems,  based  upon  MAP  estimation,  the  Viterbi  algorithm  works  as  a 
pure  phase  estimator  during  this  period.  At  the  end  of  the  preamble, 
the  Viterbi  algorithm  is  turned  into  a  joint  phase-data  MAP  estimator. 

In  the  DDPLL  and  JE  systems,  based  upon  decision-directed  algorithms, 
the  algorithm  is  directed  by  the  true  data  during  the  preamble  period. 

b) During the preamble period, identical (but unknown) data are emitted. This keeps the phase away from severe sudden fluctuations, and makes the joint phase-data estimator able to adequately acquire the initial phase.

In our simulations the VA has achieved the same data-error probability for both methods; i.e. its performance has not depended upon which learning procedure was used. On the other hand, the DBVA has proved to be sensitive to the learning procedure. For example, at SNR = 20dB, with phase variance $\sigma_w^2 = 4\sigma_n^2$, for a learning period of N=60 data, the number of errors during an emitting period of 490 data values has jumped from 7 for procedure a) (known data) to 59 for procedure b) (constant but unknown data). Moreover the DBVA typically requires a longer learning period than does the VA (roughly two times longer). A value of N=50 is sufficient for the VA, while the DBVA needs N=100 learning iterations in our simulations. The decision-directed systems (DDPLL and JE) work as the VA in these respects. That is, a preamble period of 50 data values is sufficient. These data may be unknown to the receiver, provided they are kept constant (procedure b)). No degradation with respect to procedure a) results.

Binary Symboling: Shown in Figure 9 are binary symboling results for the VA when $\sigma_w^2 = 0.01$ rad$^2$ ($\sigma_w = 5.7°$) and SNR ranges from 4 to 10dB. (Recall SNR = $10 \log_{10} 1/2\sigma_n^2$.) For comparison the performance curves for coherent binary orthogonal and coherent binary antipodal systems are also shown. The simulation results for binary orthogonal symboling are of no inherent interest in their own right because even fully coherent binary orthogonal symboling provides only marginal gains over incoherent binary symboling at SNRs of practical interest. This point is made with curves 1 and 2 of Figure 9. However, the simulation results for binary orthogonal symboling serve to validate the simulation. The simulation results for binary antipodal symboling are interesting because incoherent reception is not possible with antipodal symboling. The results indicate that performance with the VA is essentially equivalent to that of a fully coherent receiver, even for a relatively large value of $\sigma_w$.
8-PSK: Shown in Figure 10 are simulation results for 8-PSK when SNR ranges from 16-19dB and $(\sigma_w^2 \sigma_n^2)^{1/2}$ remains fixed at $4.4 \times 10^{-3}$ rad$^2$. The solid circles correspond to the VA and the solid triangles correspond to the markedly simpler JE. Also shown on Figure 10 are performance bounds for fully coherent 8-PSK and 16-PSK symboling. The values of $\sigma_w$ under investigation range from 1.66 to 2.2° and the ratio $\sigma_w^2/\sigma_n^2$ is very small, ranging from 0.03 to 0.12. In this case neither the VA nor the DBVA provides significant improvement over the JE or DDPLL. The latter two receivers are simpler than the DBVA which, in turn, is simpler than the VA. Therefore for such cases of weak phase noise, and a simple symbol constellation, neither the VA nor the DBVA would be favored over the JE or the DDPLL.

16-QASK: Shown in Figures 11, 12 and 13 are simulation results for 16-QASK symbols encoded according to the (4,4) CCITT rule. The decoding procedures are JE, DDPLL, DBVA and VA, for three distinct values of the ratio $\sigma_w^2/\sigma_n^2$. Figure 11 is concerned with a weak phase noise ($\sigma_w^2/\sigma_n^2 = 0.25$). Figure 12 is concerned with an average phase noise ($\sigma_w^2/\sigma_n^2 = 1$), and Figure 13 is concerned with a large phase noise ($\sigma_w^2/\sigma_n^2 = 4$). We recall [1] that the DBVA performs some kind of phase estimation along a path that satisfies


*. 


n-1  w 


(32) 


using  a  Viterbi  algorithm.  The  DBVA  that  we  have  simulated  is  somewhat 
different  from  Ungerboeck's  DBVA,  in  which  the  number  of  possible  phase 
states  at  each  iteration  is  limited  to  6  or  8.  In  our  simulation  the 


number  of  phase  states  is  not  limited,  thus  avoiding  one  possible  cause 
of  errors  and  improving  the  error  rate,  but  also  increasing  the  compu¬ 
tational  complexity  with  respect  to  [1]. 
Behavior of DDPLL and JE on CCITT (4,4) Constellation: The decision-directed algorithms (DDPLL and JE) have essentially the same performance as shown in Figures 11-13. The DDPLL is superior to the JE by only 0.5dB. The slight inferiority of the JE is largely compensated by the fact that the complex gain of the JE can also correct rapid gain fluctuations in the channel. We emphasize that the curves of the DDPLL and JE are biased and cannot be trusted just as they are, because of the occurrences of very large bursts of errors at relatively high error probabilities. When such bursts have occurred in the simulation runs, they have been withdrawn from the error rate computation. For instance, with $\sigma_w^2 = 0.25\sigma_n^2$ and SNR = 17 dB, at an error probability on the order of $10^{-2}$, between one fourth and one third of the simulation runs (with length 500 data values) have exhibited bursts of about a hundred errors. In the simulations, the bursts began to occur at SNR = 18dB, 21.5dB, and 26dB for $\sigma_w^2/\sigma_n^2 = 0.5$, 1 and 4 respectively. This corresponds to a value of $\sigma_w$ such that $4\sigma_w$ ranges between 11.5° and 20°. The phenomenon of error bursts can be explained as follows: because the phase increment is Gaussian it will occasionally reach the value $4\sigma_w$. If, at the same time, the noise is relatively large, the angle between the observed data and the transmitted symbol will exceed the value 22.5° that corresponds to the angular threshold for an error in the 16-point CCITT diagram (see Figs. 2d and 4a). No type of decision-directed phase estimator can correct such an error. Therefore the phase estimate will become incorrect (by a shift of ±45°), causing a burst of errors. Moreover, as expected, the error probability at which bursts of errors occur decreases as the phase fluctuations increase, making the receiver less and less reliable. For instance, at the error probability of about $10^{-3}$ with $\sigma_w^2 = 0.25\sigma_n^2$ no bursts occurred; on the other hand, at the error probability of about $10^{-4}$ with $\sigma_w^2 = 4\sigma_n^2$, we have obtained one 500 sample run out of 40 such runs that was a burst. For the other 19500 samples of the other 39 runs, no error was observed.

With  respect  to  burst  phenomena,  the  DDPLL  and  JE  behave  similarly. 

Behavior of DBVA and VA on CCITT (4,4) Constellation: The performance of the VA is superior to that of the DBVA. The gain achieved by the VA over the simpler DBVA is monotone increasing in the ratio of phase fluctuation variance $\sigma_w^2$ to additive noise variance $\sigma_n^2$. While there is no gain when $\sigma_w^2/\sigma_n^2 = 0.25$, the gain is 1dB for $\sigma_w^2/\sigma_n^2 = 1$ and 2dB for $\sigma_w^2/\sigma_n^2 = 4$. Both systems perform better than the DDPLL or JE, the improvement again being a monotone increasing function of $\sigma_w^2/\sigma_n^2$.

A very important point is that the use of either of the two MAP phase estimators precludes the occurrence of error bursts. The errors seem to be grouped by two or three and no error multiplication occurs since the phase estimator is not decision-directed. Thus such MAP sequence estimators can be used even at high error probabilities on the order of $10^{-2}$ or $10^{-1}$.

Comparison between MAP and Decision-Directed Phase Estimators: The improvement that can be gained by using any type of MAP estimator for phase rather than a simple decision-directed algorithm is again an increasing function of $\sigma_w^2/\sigma_n^2$. Figure 11 shows that only 1dB is gained by the DBVA and the VA over the DDPLL if $\sigma_w^2/\sigma_n^2 = 0.25$. This gain is realized at a high computational price. For phase fluctuations and additive noise of the same importance ($\sigma_w^2/\sigma_n^2 = 1$), the VA outperforms the DDPLL by 3 dB (see Fig. 12), but the gain is reduced to 2dB for the simpler DBVA. For large phase fluctuations, the gain is important. For instance Figure 13 shows that the VA outperforms the DDPLL by 5dB when $\sigma_w^2/\sigma_n^2 = 4$. In addition the VA brings the insurance that no burst of errors can occur, even for very poor SNR and large phase fluctuations.

Sensitivity to imperfect knowledge of $\sigma_w^2/\sigma_n^2$: It is easily seen in (18) or (37) that the only parameter required in order to proceed with the VA is the ratio of phase variance to additive noise power. The same holds for the DDPLL whose optimal gain $K_1$ depends on this ratio (see (49)), and for the JE whose step-size $\mu$ (see (14)) is to be kept close to $K_1$, but smaller, provided the data diagram has unit power. As for the DBVA it requires only the knowledge of $\sigma_w^2$ in order to determine the number $m$ of discretized phase levels. Thus an important feature of each system is its sensitivity to an imperfect knowledge of $\sigma_w^2/\sigma_n^2$ (or $\sigma_w^2$) because firstly $\sigma_w^2$ can vary with time and secondly the actual phase can fluctuate according to a statistical model that is different from the one expected. The less sensitive the system is to the knowledge of $\sigma_w^2/\sigma_n^2$ (or $\sigma_w^2$), the more robust it is.

a) Sensitivity of the decision-directed systems. Let us denote $\sigma_w^2/\sigma_n^2$ by $\alpha$. The function $K_1(\alpha)$ that gives the optimum loop-gain of the DDPLL is sketched in Figure 14. It is quite flat except for $\alpha$ very close to zero (e.g. $\alpha < 0.2$).

Now the case $\alpha \ll 1$ is of no real interest for the purpose of this paper. Indeed it has been seen previously that, in this case, no MAP phase estimator is worth being worked out. Moreover any type of (reasonable) phase estimator will perform satisfactorily. When $\alpha$ is not negligible, $K_1(\alpha)$ is slowly varying. For example $K_1(1)/K_1(0.25) = 1.59$, and $K_1(4)/K_1(1) = 1.34$. Thus the value $K_1(1) = 0.62$ for the
DDPLL gain is correct for a large range of values of $\alpha$. This fact is largely confirmed by the simulations. However, due to the risk of error multiplication that increases very rapidly with $K_1$, the gain should be set to the lower bound $K_1(\alpha_{\min})$ corresponding to the smallest $\alpha$ that can be expected, rather than to an average value $K_1(\alpha_{\mathrm{ave}})$, which will sometimes be too large and bring error bursts. Thanks to this precaution, the DDPLL is insensitive to $\alpha$. It is a robust system.

The robustness of the JE is also excellent. This fact was checked on numerous computer simulations: as a function of the step-size $\mu$ the error probability $P(E;\mu)$ exhibits a minimum which is very flat, as sketched in Figure 15. The range where the minimum is reached does not depend critically upon $\alpha$. A value such as $\mu = 0.4$ corresponds to the minimum of error probability for $\alpha$ in the range [0.25-1], and for a unit energy data diagram.

b) Sensitivity of the MAP phase estimators. The VA sensitivity to imperfect knowledge of $\alpha$ has been tested in our computer simulations. It appears that the VA performance is not appreciably degraded by an error of ±6dB for $\alpha$. Hence the VA robustness is at least as good as that of the decision-directed algorithms.

On the other hand the DBVA robustness has turned out to be poor. For instance, with SNR=21dB and $\alpha = 4$, the DBVA is supposed to work with $m = 2\pi/\sigma_w \approx 50$ phase levels. If only 45 levels are used, corresponding to a 0.9dB error for $\alpha$, then the error probability is increased by a factor of 2. In fact, as a function of $m$, $P(E;m)$ exhibits a minimum, but it is a sharp minimum. This poor robustness can be understood by noting that in the DBVA, the path metric is not a function of $\alpha = \sigma_w^2/\sigma_n^2$, but only of $\sigma_w^2$. This may be one of the main drawbacks of the DBVA. This observation suggests that one should implement the DBVA with a conservatively large number of phase levels to provide robustness. This increases complexity.
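The figures quoted for the DBVA can be reproduced with a short calculation. The sketch below assumes that the number of levels is $2\pi$ divided by the phase-increment standard deviation, a rule that is consistent with the value of about 50 levels quoted for SNR = 21dB and a variance ratio of 4, and it uses the convention SNR = $10\log_{10}(1/2\sigma_n^2)$ recalled in Section VII. The function names are ours.

```python
import math

def sigma_n2_from_snr_db(snr_db):
    """Noise variance from SNR = 10 log10(1 / (2 sigma_n^2))."""
    return 1.0 / (2.0 * 10.0 ** (snr_db / 10.0))

def dbva_levels(snr_db, alpha):
    """Assumed level count 2*pi/sigma_w, with sigma_w^2 = alpha * sigma_n^2."""
    sigma_w = math.sqrt(alpha * sigma_n2_from_snr_db(snr_db))
    return 2.0 * math.pi / sigma_w

def level_error_db(m_used, m_nominal):
    """Error in sigma_w^2 (in dB) implied by using m_used instead of m_nominal levels."""
    return 20.0 * math.log10(m_nominal / m_used)
```

Under these assumptions a change from 50 to 45 levels corresponds to an error of a little over 0.9dB in the assumed phase variance.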
VIII. Simulation Results: Bounded-Increments Phase Jitter

For all simulation results of this section the phase space $[-\pi, \pi)$ has been discretized to 32 equally-spaced phase values and a VA has been programmed to solve (17). The assumed increment density $h(w)$ is the uniform density

$$h(w) = \begin{cases} 1/2a, & -a \le w < a;\ 2a = 2\pi/16 \\ 0, & \text{otherwise} \end{cases} \tag{33}$$

The corresponding discrete transition density for use in the path metric is

$$f(\phi_k \mid \phi_{k-1}) = \begin{cases} 1/3, & \phi_k - \phi_{k-1} = -\pi/16,\ 0,\ \pi/16 \\ 0, & \text{otherwise} \end{cases} \tag{34}$$

The resulting VA is related to the class of so-called M-algorithms in which all survivors are saved, but only M (in this case 3) candidate originator states are allowed. This significantly reduces calculations and results in an algorithm similar in spirit to the DBVA of [1]. Still, however, phase is tracked only on $[-\pi, \pi)$ rather than on $(-\infty, \infty)$.
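With only three originator states per phase value, one survivor update reduces to a comparison among each state and its two neighbours on the circle. The sketch below illustrates this for the phase states alone (the data symbols are left out for brevity); since the transition probability is the same constant 1/3 for all allowed moves, its logarithm is dropped from the metric. The function name and array layout are our own choices.

```python
import numpy as np

def bounded_increment_step(metric, branch_cost):
    """One survivor update when a state may be reached only from itself or a neighbour.

    metric      : accumulated path metrics of the m states at time k-1
    branch_cost : measurement terms of the m states at time k
    Returns the new metrics and the index of the chosen predecessor of each state.
    """
    m = metric.size
    # Row 0: arriving from state i-1, row 1: from i, row 2: from i+1 (circularly).
    stacked = np.stack([np.roll(metric, 1), metric, np.roll(metric, -1)])
    choice = stacked.argmax(axis=0)
    best = stacked[choice, np.arange(m)]
    predecessor = (np.arange(m) + choice - 1) % m
    return best + branch_cost, predecessor
```

Each update costs three comparisons per state instead of $m$, which is the source of the saving mentioned above.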

Source symbols have been generated independently from a 4-PSK alphabet and used to differentially encode phase according to a Gray code. The random phase sequence has been generated in ways to be discussed below.

Markov Phase with Non-Gaussian Increments: Here the phase is generated according to (6) with $h(w)$ given by (33). Thus the algorithm is matched to the actual phase sequence. Shown in Fig. 16 are performance results for the VA and for the JE. The VA outperforms the JE by 1.5dB over the range 10dB ≤ SNR ≤ 15dB. The probability of error is "probability of bit error."

Sinusoidal Phase Jitter: Here the phase jitter is sinusoidal (see (4)) with uniformly-distributed initial phase and frequency $\nu$. The frequency is chosen such that $\nu\Delta = 1/24$, corresponding to a transmission rate of 4800 b/s with baud rate $1/\Delta = 2400$Hz and jitter frequency $\nu = 100$Hz. The runs are 2000 to 10,000 steps long, corresponding to 4000 to 20,000 transmitted bits. The peak-to-peak phase deviation is 20° or 60°. For these experiments the VA outperforms the JE by 1.5-1.7dB. This gain is, of course, achieved at a high price in complexity.

Comparison of the JE and VA: In the simulations reported above, the ratio $\alpha = \sigma_w^2/\sigma_n^2$ ranges from 0.02 to 0.81, that is from small to average values. No burst of errors has ever been observed for the JE. This is due to the fact that the phase increment is always bounded as appears in (4) and also (33). The bound is much smaller than the angular distance between adjacent data. Thus there is no risk of a ±90° slip (corresponding to the 4-PSK diagram) in the JE phase estimation. Hence the errors will be scattered rather than grouped, and no error multiplication phenomenon can happen.

Owing to this consideration, to the fact that the VA outperforms the JE by only 1.5dB, and to the complexity of the VA, a practical system will implement the JE (or DDPLL) rather than the VA (or DBVA), in the case of bounded increment phase jitter and a simple symbol constellation such as the 4-PSK.
IX. Conclusions

We have derived a principle of optimality for phase-amplitude encoded symboling that allows one to simultaneously track random phase and decode data symbols using the VA derived in [8] and [9]. The VA is designed for a random walk phase process, a very severe type of phase process. In such a process there exists the possibility of large phase jumps. The VA gives excellent performance.

In order to reach conclusions about the type of phase estimation that should be used for a given type of phase fluctuations, the performance of the VA has been compared, on computer simulations with various data diagrams, with that of two simple decision-directed phase estimators, namely the JE of [3] and the DDPLL of [5], and with that of the DBVA. The simulations indicate that the choice among the four systems is to be made according to four parameters:

(i) the error probability $P(E)$ at which the system is to be used;

(ii) the relative importance $\alpha = \sigma_w^2/\sigma_n^2$ of phase fluctuations with respect to additive noise;

(iii) the complexity $C$ that is technologically feasible and acceptable;

(iv) the maximum phase increment $\Delta\phi_{\max}$ that is to be expected, as compared to the angular distance between points of the data diagram.


case  1 

case  2 

JE  or 
DDPLL 

JE  or 
DDPLL 

small 

JE  or 
DDPLL 

see 

Table  2 

small 

case  4 

VA  or 

large 

see 

Table  2 

VA  or 
DBVA 

large 

_ i 

DBVA 

DBVA 

Table  1  Table  2 

The choice between the two decision-directed phase estimators, JE or DDPLL, is irrelevant for the matters discussed in this paper. It appears in Tables 1 and 2 that the VA and DBVA are preferred when $\alpha$, $P(E)$ and $\Delta\phi_{\max}$ are large. The comparison between these two MAP phase estimators shows that the VA is more robust, has a shorter learning period and outperforms the DBVA by 2dB or more when $\alpha$ is at least equal to 4.
Appendix: Monotonicity of Folded Normal and Cauchy Densities

There are many choices for the phase increment density $h(x)$ that are physically interesting and mathematically tractable. Two of particular interest are the normal density and the Cauchy, the latter being useful in the modelling of "heavy-tailed" behavior. When folded around the unit circle according to (8) these densities yield transition densities which achieve their maximum at $\phi_k - \phi_{k-1} = 0$ and decrease monotonically on the interval $0 \le \phi_k - \phi_{k-1} \le \pi$.

Consider first the Cauchy case

$$g_1(x) = \sum_{k=-\infty}^{\infty} \frac{a/\pi}{a^2 + (x + k2\pi)^2} \tag{A-1}$$

According to Poisson's summation formula [13], this may be written

$$g_1(x) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{-a|k|} e^{jkx} = \frac{1}{2\pi}\,(1 - e^{-2a})(1 - 2e^{-a}\cos x + e^{-2a})^{-1}. \tag{A-2}$$

This function achieves its maximum value at zero and decreases monotonically.
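The closed form (A-2) can be checked numerically against a truncated version of the folded sum. The following sketch is only a sanity check under the assumption that the unfolded density is the Cauchy density with scale parameter $a$; the truncation length and function names are arbitrary choices of ours.

```python
import numpy as np

def folded_cauchy_series(x, a, n_terms=2000):
    """Cauchy density with scale a folded onto the circle, by direct summation."""
    k = np.arange(-n_terms, n_terms + 1)
    return np.sum((a / np.pi) / (a**2 + (x[:, None] + 2.0 * np.pi * k) ** 2), axis=1)

def folded_cauchy_closed(x, a):
    """Closed form of the folded Cauchy density obtained from Poisson summation."""
    return (1.0 - np.exp(-2.0 * a)) / (
        2.0 * np.pi * (1.0 - 2.0 * np.exp(-a) * np.cos(x) + np.exp(-2.0 * a))
    )
```

On a grid over $[0, \pi]$ the closed form is strictly decreasing, in agreement with the statement above.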

In the normal case

$$g_1(x) = \sum_{k=-\infty}^{\infty} (2\pi\sigma^2)^{-1/2} \exp\{-(x + 2k\pi)^2/2\sigma^2\} \tag{A-3}$$

Again, by Poisson's summation formula

$$g_1(x) = \sum_{k=-\infty}^{\infty} (2\pi)^{-1} \exp\{jkx - k^2\sigma^2/2\} \tag{A-4}$$
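The equality of the folded sum and its Fourier form can likewise be verified numerically. The sketch below truncates both sums and compares them; the truncation lengths and the names are our own, and the Fourier series is written in its real form $1 + 2\sum_{k \ge 1} e^{-k^2\sigma^2/2}\cos kx$.

```python
import numpy as np

def folded_normal_direct(x, sigma, n_wraps=10):
    """Normal density with variance sigma**2 folded onto the circle, by direct summation."""
    k = np.arange(-n_wraps, n_wraps + 1)
    z = (x[:, None] + 2.0 * np.pi * k) / sigma
    return np.exp(-0.5 * z**2).sum(axis=1) / np.sqrt(2.0 * np.pi * sigma**2)

def folded_normal_fourier(x, sigma, n_terms=50):
    """The same density from its Fourier series (Poisson summation)."""
    k = np.arange(1, n_terms + 1)
    coeff = np.exp(-0.5 * (k * sigma) ** 2)
    return (1.0 + 2.0 * (coeff * np.cos(x[:, None] * k)).sum(axis=1)) / (2.0 * np.pi)
```

For small $\sigma$ the direct sum converges in a few terms, while for large $\sigma$ the Fourier form is the more economical of the two.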


This infinite sum goes by the name $\vartheta_3(x, q = e^{-\sigma^2/2})$ in the theory of Jacobian elliptic functions and theta functions [15]. The theta function $\vartheta_3(x, q)$ is known to be monotone decreasing on the interval